perm filename THESIS[F75,JMC]13 blob sn#620771 filedate 1981-10-25 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00006 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	.require "memo.pub[let,jmc]" source
C00012 00003	THEORY OF ARTIFICIAL INTELLIGENCE
C00017 00004	MATHEMATICAL THEORY OF COMPUTATION
C00025 00005	OTHER RESEARCH IDEAS
C00027 00006	.<<references>>
C00029 ENDMK
C⊗;
.require "memo.pub[let,jmc]" source
.AT "qb" ⊂"%8T%*"⊃
.cb "NOTES ON RESEARCH TOPICS, ESPECIALLY FOR THESES"


	I am prepared to supervise research in the following topics,
although I am also willing to consider other topics proposed by
students.  Deciding which of these topics would lead to PhD theses will
require work by the student.  In many of them, the student would have
to contribute substantially to the definition of the problem as
well as to finding a solution.

SYSTEM PROGRAMMING

	It would be hard to persuade me to supervise the design of
a new programming language or design an operating system as
a thesis project.  There would really have to be a new idea
and not merely a dissatisfaction with the details of present
implementations.  (Note of February, 16 1979: I have a new idea -
see below).  Instead I would prefer to see research in
new areas such as the following:

.item←0
#. Information description.  Many files exist that
are accessible to our computer over the ARPAnet or by
telephone.  Develop a language for describing a class of
such files that would permit a program to answer a question
by extracting the information from such a file.  Easy (I hope)
example:  many ARPAnet sites have telephone directories in
their computer as do we.  These directories all have approximately
the same information but have different file formats.
Devise a way of describing such files so that a single program
can find addresses and telephone numbers from any of the files
using the description of that file.

	The description should also permit putting information
into the file.  Preferably it should be possible to describe the
file partially, so that if the M.I.T. phone file normally has
some weird entry like our BAM allocation, we wouldn't have to
know about it in order to read or change a file or to create a
new one.  In the last case, there would probably have to be a
default given.

	It is important to understand that we are talking about
describing existing files that other people have created according
to their own tastes and not a shiny new file system that would
conquer the world.

	Many files are not directly readable and modifiable.  Instead
they are parts of information systems designed for direct human
use.  In order that a computer could extract information from or
put information into such a file, it would need a description
of the system of interaction.  One such file, accessible over the
ARPAnet is the Lawrence Berkeley Lab file of census data.
A file description system that would permit programs to answer questions
about this file once it has been described would be a substantial
achievement.  To play fair, the description system should be designed
before the designer sees the manual for the LBL data base.

#. (February 1979).  Design a "history oriented programming language".
The object of a high level programming language is to allow the
programmer to express his procedure in terms close to those in which
it is convenient for him to think.  We often think in terms of
the past, e.g. %2Do not generate castling as a legal move if the
king or rook has moved%1.  %2If the driver has had 5 speeding convictions
in the past 2 years, start proceedings to revoke his license%1.
Many other examples can (and should) be given (by the student) in
which the idea of the program involves the use of information about
past events.  These events can have occurred in the outside world
or they may be events occurring in the operation of the program
itself.

	In all present programming languages, writing a program
depending on the past requires the programmer to decide what information
has to be collected and how it is to be stored.  Much labor would
be saved if the program could refer directly to past events, %2and the
compiler would decide what information had to be obtained and saved
and how it should be represented%1.

	One approach to the problem would be to consider a virtual
program that stored a record of all events that occurred - all
input and output and all assignments - in a history array.  Programs
could refer to this array, but couldn't change it.  With present
memory sizes, direct implementation of this is impossible, so the
compiler would have to decide which information was actually referred
to and store only that.  (To stretch the imagination a bit, recall that
there are 6%8x%110%523%1 molecules in a mole, and a complete history
of the actions of the KL-10 would store only of the order of 10%56%1
words per second.

	A start on the design of such a programming language is described
in ELEPHA[S79,JMC], but making a compiler for ELEPHANT would be a good
thesis project, because it would have to be a heuristic program.

#. Super SETL.  SETL is Jack Schwartz's set oriented programming language.
I like the goals of SETL, because allowing the programmer to work directly
with set theoretic operations is indeed a powerful and high level tool.
However, I think the SETL project is too limited in two respects.  First
it has a standard way of representing sets, but the way sets should
be represented depends strongly on how they are to be used.  Sets can
be represented as bit vectors, lists, hash tables, balanced trees,
programs for testing elements of larger sets, program for successively
generating elements of the given set, and programs for getting elements
of the set by operations on other sets.  A high level system would
provide for all these ways of representing sets, allow the program
to choose how particular sets should be represented or make the decision
itself if asked.

	The second issue is a bit vaguer.  Often what we need to represent
is information about as set rather than a full description of the set
itself.  Deciding what information and how it might be represented will require
the exercise of some inventiveness.

#. Implement core war. See COREWA[S77,JMC].

#. Business communication language.  See CBCL[f75,jmc].
.skip 2

THEORY OF ARTIFICIAL INTELLIGENCE

#. Circumscribing a collection of sentences
	See me for an oral spiel, but first read CIRCUM.NEW[S79,JMC].

#. Finitization.  A possible scheme for showing certain sentences
satisfiable by translating them into theories in which satisfiability
means satisfiability in models of a fixed size.  See finiti[f75,jmc].

#. Formalization of knowledge.  Read (McCarthy 1959), (McCarthy
and Hayes 1969), (McCarthy 1977a), (McCarthy 1977b) and (McCarthy 1979)
and see me.  Also see KNOW.ART[F75,JMC] and KNOW[E78,JMC].

#. Formalization of the heuristic situation.  Consider a resolution
theorem prover and ask how we could describe its heuristic situation
formally.  For example, there might be sentences saying that it can't
prove ⊗clause1 without another clause containing the predicate symbol
⊗P and resolvable with ⊗clause2. 

#. Formalization of concepts.  Read same references as above and see me.

#. Formalization of blocks world.  Progress in problem solving requires
programs that distinguish between the existence (in a mathematical sense)
of a configuration meeting the specifications and its constructibility.
Thus an engineer normally designs a bridge and then figures out how
to build it, e.g. decides where to build scaffolding first.  The problem
solving systems that I know about design by plannning to build, and this
is too limited.  There is an informal writeup of some of these considerations,
but see me first.

#. Formalization of concurrent and continuous action.  A key barrier to
progress in AI is that current systems deal mainly with discrete events
that result in a definite new situation rather than with events occurring
concurrently and continuously.  There is some relation with the formalisms
for parallel programming, but AI must emphasize the handling of partial
information about systems.  There is a chess problem in Berliner's thesis
that provides an entry.

#. Heuristic programming.  Because it is difficult to treat the physical
world, it would be interesting to make an intelligent program for some
domain of human activity in which the human inputs and outputs are
purely symbolic and which gives good scope to creativity.  Perhaps
trading, i.e. mmking commercial deals, is such an activity.

#. The plots of operas and plays provide innumerable situations
in which a character plans to achieve a certain result.  Formalize
facts and enough common sense knowledge to show how he could
expect his plan to succeed.  When the plan doesn't succeed, it
is usually because of a circumstance he has not taken into account.
See if this can be interpreted as a circumscription or other piece
of non-monotonic reasoning.

#. The notion of pattern and pattern directed computation is susceptible
of useful generalization beyond what has heretofore been used in
computer science and AI.  PATTER.AI[F75,JMC] and PATTER[F77,JMC]
contain some notes that may be suggestive.

#. Formalize quantification over circumstances.  "Is there anything
else wrong with the boat".  "There are two things wrong with the boat."
.skip 2

MATHEMATICAL THEORY OF COMPUTATION

#. Proving intensional properties of programs

	A property of a program is extensional if it is a property
of the function computed by the program, i.e. a property of the
input-output relation.  Intensional properties include the time
and space used, the number of recursive calls, etc.  Our approach
to intensional properties is to make them extensional properties
of related or "derived" programs.  Ben Moszkowski has done some of
this, and some techniques are given in McCarthy and Talcott
LISP: Programming and Proving.  We have concentrated on functional
programs.  When the program is the fixed point of a computable
functional, then some intensional properties of the program are
extensional properties of the functional.  Which properties are
extensional properties of the functional remain to be seen.
Conjecture:  If the function is strict, then the number of recursions
required to evaluate the function for given arguments is an
extensional property of the program.  Indeed the set of evaluations
should be an extensional property as should a certain evaluation
tree.  The sequence of evaluations will not be an extensional
property of the functional.

#. Blobs

	A blob is a piece of flowchart with many entries and exits.
They can be glued together to make larger blobs and are characterized
by certain functions.  It would be interesting to know if the
Yanov decision procedures are consequences of functional manipulation
of blob functions.  See the file
BLOB[W76,JMC] 19-Feb-77	THE BLOB FUNCTIONS OF FLOW CHARTS
for more information.

#. First order theory of "cons should not evaluate its arguments"

	In a well-known report Daniel Friedman and 
David Wise argue that cons should be programmed as an FSUBR that
merely conses its formal arguments without evaluating them.  This
enlarges the LISP data structures to include objects like the
list of all integers.  Their evaluator won't evaluate it when it
occurs in a computation but merely expands as required
when tests or printing make it necessary.  Friedman and Wise suggested
to me that part of the extension required to treat these objects
logically is to make %2cons(qb,qb) ≠ qb, cons(%1A%2,qb) ≠ qb%1.
This isn't all, by any means.  Obviously, the S-expressions have to
be extended to include some computable infinite S-expressions.  Can
this still be done in first order logic, and what kind of a theory
do we get?  Is it good for anything or just a curiosity?

#. Representation of data for speed in computation

	I have two examples:

	First, the state graph of the "Tower of Hanoi" problem can be
nicely drawn in the plane, but there are many crossing lines.  It
can be "untwisted" to remove all the crossing lines, and the new
version has the following property: Suppose one wants to go from
one vertex to another by the shortest path.  One need only go from
the vertex one is at along the edge that makes the least angle with
direction to the goal.  Thus the global problem of the shortest path
reduces, with the aid of this representation of the graph in the plane,
to a local calculation.  It is unknown by any of the data representation
experts I have consulted whether an arbitrary finite graph can be
represented by a set of points in a Euclidean space of suitable
dimension so that the shortest path can always be found just by "going
in the right direction".  There is no reason to believe the conjecture
that it is always possible, but when it is possible, it leads to a very
easy way of finding one's way around the graph.

	Second, consider a partial ordering on a finite set ⊗A 
or more specially a tree.
Can we map ⊗f the elements of ⊗A into a suitable Euclidean space so that
%2x_<_y%1 is equivalent to ⊗f(y) being in the cone with vertex at
⊗f(x) and making a certain angle with a fixed direction.
Given such a map, represented (say) by a hash table, computation
of the partial ordering becomes very easy.  When the partial ordering
it a tree, I would conjecture that the map always exists.

	Both the above examples reduce the time-complexity of a problem
by pre-computing a space-linear structure.  The more general problem
it to make a computation of some relations or functions on a set easy
by determining a suitable 1-1 mapping of the set into a space with
an additional structure (Euclidean in the above cases) that makes
the computation easier.

	Representing subsets of a rather small set
by bit vectors makes the Boolean set operations
easy and makes corresponding operations easy on any other set which
can be 1-1 mapped into a set of sufficiently small sets.

OTHER RESEARCH IDEAS

	The following computer files contain some research ideas.

SEQUEN[F80,JMC]

A vision problem -1972

	The main question that AI needs psychological help in investigating
is - how do people represent information so that certain tasks can be done.

	Consider the representation of faces so that they can be recognized.
Suppose we take a collection of girls, all blonds with long straight hair
or some other very common type.  How do people tell them apart in the
absence of obvious special features like warts which we shall assume
absent.  Presumably by the shape of the face.  Query: Can we make a
program that by expanding and contracting parts of the face can transform
a picture of one girl into one that will be mistaken for another.  The
thesis would be to try to do this.
.<<references>>
SKIP 2
.bb REFERENCES

%3McCarthy, John (1959)%1: "Programs with Common Sense". %2Mechanisation of Thought
Processes, Volume I%1.  London:HMSO.

%3McCarthy, J. and Hayes, P.J. (1969)%1: "Some Philosophical Problems from
the Standpoint of Artificial Intelligence". %2Machine Intelligence 4%1,
pp. 463-502 (eds Meltzer, B. and Michie, D.). Edinburgh: Edinburgh
University Press.

%3McCarthy, John (1977a)%1: "Epistemological Problems of Artificial
Intelligence", in Proceedings of Fifth International Conference on
Artificial Intelligence%1, available from Department of Computer Science,
Carnegie-Mellon University, Pittsburgh, PA 15213, USA.

%3McCarthy, John%1 (1977b), %2Circumscription - A Way of Jumping to Conclusions%1,
Stanford Artificial Intelligence Laboratory, (to be published).

%3McCarthy, John (1979)%1: "First Order Theories of Individual
Concepts and Propositions", in Michie, Donald (ed.)
%2Machine Intelligence 10%1, also available on disk at SU-AI as
CONCEP[E76,JMC].

%3Moore, Robert (1979)%1: his M.I.T. PhD thesis.

This file is THESIS[F75,JMC] at SU-AI.

.turn on "→"
John McCarthy→{date}